The Circular Drift Diffusion Model

The circular drift-diffusion model (CDDM) is a stochastic sequential-sampling model that describes the choices and response times observed in tasks with a circular decision space (Smith, 2016). Like many other accumulator models, the CDDM assumes that participants accumulate information in favor of a given response option over time until a response criterion is met. This process is characterized as a random walk that starts at the origin of a circle representing the decision space and moves toward its circumference. Once the circumference is reached, the radian at the point of intersection determines the choice, and the number of steps taken to get there determines the response time.

The CDDM considers four parameters, most of which are illustrated in Fig 1. The parameter not shown is the nondecision time parameter \(\tau\). The remaining parameters are the boundary radius parameter (\(\eta\)) that serves as a response criterion, and a pair of parameters describing the overall speed and direction of the information accumulation process. These last two parameters can be expressed in Cartesian coordinates (i.e., the mean displacement observed on the X and Y coordinates per unit of time, \(\{\mu_x, \mu_y\}\)) or polar coordinates (i.e., the average speed and expected direction, \(\{\delta,\theta\}\)).
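The two parameterizations are related by the usual polar-to-Cartesian conversion. The helper functions below are illustrative only (they are not part of the CDDM code base), but they show the mapping between \(\{\delta,\theta\}\) and \(\{\mu_x, \mu_y\}\):

```r
# Convert the polar drift parameters {delta, theta} to Cartesian drifts
# {mu_x, mu_y}, and back. Illustrative helpers, not part of the CDDM code.
polar_to_cartesian <- function(delta, theta) {
  c(mu_x = delta * cos(theta), mu_y = delta * sin(theta))
}
cartesian_to_polar <- function(mu_x, mu_y) {
  c(delta = sqrt(mu_x^2 + mu_y^2), theta = atan2(mu_y, mu_x))
}

# With delta = 3.5 and theta = pi, all drift is in the negative X direction
mu <- polar_to_cartesian(delta = 3.5, theta = pi)
```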

Fig 1. Graphical illustration of the CDDM


dCDDM(*) - CDDM density function

pCDDM(*) - CDDM cumulative distribution function

Illustrative parameter set

In this document, we present a comparison between different sampling algorithms that generate data under the CDDM. We will describe how each of these algorithms works and highlight how they differ in the computation time they require and in their accuracy. For starters, we will generate n = 5000 bivariate observations using the arbitrary set of parameter values shown below:

# Arbitrary set of parameter values
par <- list("drift" = 3.5, 
            "theta" = pi,
            "tzero" = 0.2,
            "boundary" = 2)
n <- 5000 # No. samples

Algorithm 1: Random walk emulator

Each pair of observations is obtained by emulating the random walk process described by the CDDM. We emulate the information accumulation process by iteratively sampling random movements in the X and Y directions using \(\mu_x\) and \(\mu_y\), until the walk reaches the circumference of the boundary circle.
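A minimal, self-contained sketch of this emulation step is shown below. It is a simplified stand-in, not the `sample.RW.cddm` implementation: it takes Gaussian steps with mean displacement \(\{\mu_x, \mu_y\}\) per small time step `dt` until the boundary is crossed, then reads off the crossing angle and the elapsed time.

```r
# Hypothetical one-trial random-walk emulator (simplified relative to
# sample.RW.cddm): Gaussian increments with drift {mu_x, mu_y} per step dt.
rw_one_trial <- function(delta, theta, tzero, boundary, dt = 0.001, sd = 1) {
  mu_x <- delta * cos(theta)
  mu_y <- delta * sin(theta)
  x <- 0; y <- 0; steps <- 0
  while (sqrt(x^2 + y^2) < boundary) {      # walk until the circumference
    x <- x + mu_x * dt + sd * sqrt(dt) * rnorm(1)
    y <- y + mu_y * dt + sd * sqrt(dt) * rnorm(1)
    steps <- steps + 1
  }
  c(choice = atan2(y, x) %% (2 * pi),       # radian at the crossing point
    RT     = tzero + steps * dt)            # nondecision time + walk time
}

set.seed(1)
trial <- rw_one_trial(delta = 3.5, theta = pi, tzero = 0.2, boundary = 2)
```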

source("./code/cddm/sim_randomWalk.R")
X.RW <- sample.RW.cddm(n,par)
plot.CDDM(X.RW, par, choice.col.RGB = rgb.RW)

par(pty="m", mar = c(3, 3, 3, 0)) 
hist(X.RW$bivariate.data[,2], main = "Response Times", col = col.RW(0.7))

The execution of this first algorithm took approximately 15.3942 seconds.

Algorithm 2: Rejection sampling algorithm

max.RT <- max(X.RW$bivariate.data[,2])
source("./code/cddm/sim_Reject.R")
X.Reject <- sample.Reject.cddm(n,par,max.RT,plot=TRUE)
Fig.2 Visual representation of the rejection sampling algorithm. The blue lines represent the joint density function of response times and choices prescribed by the CDDM. The dots represent the candidate (choice, RT) pairs generated from the bivariate space of possible RTs and choices, plotted at the height of the random uniform(0, maxDensity) value used as the rejection criterion. If the rejection value falls below the density function (i.e., the dot falls below the density curve), the candidate is accepted; if the rejection value surpasses the density (i.e., the dot is drawn above the curve), the candidate is rejected. The process of generating, testing, and keeping candidates is repeated until the desired sample size is achieved.


The execution of this second algorithm took approximately 5.6096 seconds.
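The scheme described in Fig. 2 can be written generically for any bivariate density. The sketch below hedges by using an arbitrary toy density (exponential RTs, uniform choices) in place of the CDDM density that `sample.Reject.cddm` actually evaluates; the accept/reject logic is the part being illustrated.

```r
# Rejection sampler for a generic bivariate density f(choice, rt),
# following the scheme in Fig. 2. sample.Reject.cddm uses the CDDM
# density; here a toy density stands in so the sketch is self-contained.
reject_sample <- function(n, f, max.RT, max.density) {
  out <- matrix(NA, nrow = n, ncol = 2,
                dimnames = list(NULL, c("choice", "RT")))
  kept <- 0
  while (kept < n) {
    cand.choice <- runif(1, 0, 2 * pi)   # candidate choice (radians)
    cand.rt     <- runif(1, 0, max.RT)   # candidate response time
    u <- runif(1, 0, max.density)        # rejection criterion
    if (u < f(cand.choice, cand.rt)) {   # dot falls below the density curve
      kept <- kept + 1
      out[kept, ] <- c(cand.choice, cand.rt)
    }
  }
  out
}

# Toy density: exponential RTs, choices uniform on the circle
toy_f <- function(choice, rt) dexp(rt, rate = 1) / (2 * pi)
set.seed(2)
X.toy <- reject_sample(200, toy_f, max.RT = 10, max.density = 1 / (2 * pi))
```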

plot.CDDM(X.Reject, par, choice.col.RGB = rgb.Reject)

par(pty="m", mar = c(3, 3, 3, 0)) 
hist(X.Reject[,2], main = "Response Times", col=col.Reject(0.7))

Algorithm 3: Metropolis algorithm

The following Metropolis algorithm uses the density function to generate random observations under the CDDM. Please read the comments I’ve left throughout the code to get a better idea of how this algorithm works. This algorithm only needs a list par that specifies the values to use for each of the four parameters of the model (in either of its parameterizations, using polar or Cartesian coordinates), and an arbitrary value max.RT that indicates the maximum RT we’d expect to observe under a realistic scenario.
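For readers unfamiliar with the method, the sketch below shows a bare-bones Metropolis sampler over the bivariate (choice, RT) space. It is a hedged illustration, not the `sample.Metropolis.cddm` implementation: the CDDM density is replaced by a self-contained toy density, choices are proposed with a Gaussian step wrapped onto the circle, and out-of-range RT proposals are simply rejected.

```r
# Bare-bones Metropolis sampler over (choice, RT) for a generic density f.
# Illustrative sketch only; sample.Metropolis.cddm evaluates the CDDM density.
metropolis_sample <- function(n, f, start, max.RT, prop.sd = c(0.5, 0.5)) {
  chain <- matrix(NA, nrow = n, ncol = 2,
                  dimnames = list(NULL, c("choice", "RT")))
  current <- start
  for (i in 1:n) {
    prop <- c((current[1] + rnorm(1, 0, prop.sd[1])) %% (2 * pi),  # wrap angle
              current[2] + rnorm(1, 0, prop.sd[2]))                # propose RT
    if (prop[2] > 0 && prop[2] < max.RT &&
        runif(1) < f(prop[1], prop[2]) / f(current[1], current[2])) {
      current <- prop                    # accept the proposal
    }                                    # otherwise keep the current state
    chain[i, ] <- current
  }
  chain
}

# Toy density: exponential RTs, choices uniform on the circle
toy_f <- function(choice, rt) dexp(rt, rate = 1) / (2 * pi)
set.seed(3)
chain.toy <- metropolis_sample(1000, toy_f, start = c(pi, 1), max.RT = 10)
```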

source("./code/cddm/sim_Metropolis.R")
X.Metro <- sample.Metropolis.cddm(n,par,max.RT,plot=TRUE)

The execution of this third algorithm took approximately 3.9742 seconds.

plot.CDDM(X.Metro, par, choice.col.RGB = rgb.Metro)

par(pty="m", mar = c(3, 3, 3, 0)) 
hist(X.Metro[,2], main = "Response Times", col=col.Metro(0.7))

Algorithm 4: Inverse CDF (tabular approximation)
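The idea behind this algorithm is to tabulate the CDF on a grid, draw uniform values, and invert the CDF by table lookup. The one-dimensional sketch below is a hedged illustration of that idea (not the `sample.invCDF.cddm` implementation, which works with the bivariate CDDM distribution); it uses an Exponential(1) density standing in for the RT distribution so the result can be sanity-checked.

```r
# Inverse-CDF sampling from a tabled (gridded) CDF, in one dimension.
# Hedged illustration of the idea behind sample.invCDF.cddm.
inv_cdf_table_sample <- function(n, pdf, grid) {
  cdf <- cumsum(pdf(grid))
  cdf <- cdf / cdf[length(cdf)]                 # normalized tabled CDF
  u <- runif(n)
  idx <- pmin(findInterval(u, cdf) + 1, length(grid))  # first grid point
  grid[idx]                                     # whose CDF exceeds u
}

set.seed(4)
rt.grid <- seq(0.001, 10, length.out = 2000)
rts <- inv_cdf_table_sample(5000, function(t) dexp(t, rate = 1), rt.grid)
```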

source("./code/cddm/sim_invCDF.R")
X.invCDF <- sample.invCDF.cddm(n, par, max.RT = max.RT)
plot.CDDM(X.invCDF, par, choice.col.RGB = rgb.invCDF)

par(pty="m", mar = c(3, 3, 3, 0)) 
hist(X.invCDF[,2], main = "Response Times", col = col.invCDF(0.7))

The execution of this fourth algorithm took approximately 23.1327 seconds.

Accuracy testing

Step 1: Obtain empirical CDF (eCDF)

# Load file containing custom eCDF function
source("./code/general_functions/eCDF.R")

RW.eCDF <- myECDF(X.RW$bivariate.data)
Reject.eCDF <- myECDF(X.Reject)
Metro.eCDF  <- myECDF(X.Metro)
invCDF.eCDF <- myECDF(X.invCDF)
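For orientation, a bivariate eCDF evaluated at the data points gives, for each observation, the proportion of observations that are smaller or equal on both coordinates. The snippet below is a plain-R sketch of that computation (the actual `myECDF` in eCDF.R may differ in details):

```r
# Sketch of a bivariate empirical CDF evaluated at each data point:
# the proportion of observations dominated on both coordinates.
bivariate_ecdf <- function(data) {
  n <- nrow(data)
  sapply(1:n, function(i)
    mean(data[, 1] <= data[i, 1] & data[, 2] <= data[i, 2]))
}

set.seed(5)
toy.data <- cbind(runif(100, 0, 2 * pi), rexp(100))  # toy (choice, RT) pairs
e <- bivariate_ecdf(toy.data)
```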

Step 2: Approximate the theoretical CDF (tCDF)

# Approximate the theoretical CDF for the Random Walk data
start_time <- Sys.time()
RW.tCDF <- pCDDM(X.RW$bivariate.data, par$drift, par$theta, par$tzero, par$boundary)
end_time <- Sys.time()
RW_tCDF.time <- end_time-start_time
RW_tCDF.time <- round(RW_tCDF.time,4)

Obtaining the approximate CDF for the Random Walk data with a trapezoid algorithm took 4.373 seconds.
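The trapezoid rule that underlies this approximation is easy to illustrate in one dimension. The sketch below integrates a known density (an Exponential(1), standing in for the CDDM joint density) so the approximation can be checked against a closed-form CDF; `pCDDM` applies the same idea over the bivariate space of choices and RTs.

```r
# Trapezoid-rule approximation of a CDF in one dimension:
# sum the areas of trapezoids under the density up to the evaluation point.
trapezoid_cdf <- function(pdf, upper, n.grid = 1000) {
  x <- seq(0, upper, length.out = n.grid)
  y <- pdf(x)
  h <- diff(x)
  sum(h * (y[-1] + y[-n.grid]) / 2)   # sum of trapezoid areas
}

# Check against a known CDF: P(T <= 2) for an Exponential(1) variable
approx.p <- trapezoid_cdf(function(t) dexp(t, rate = 1), upper = 2)
# pexp(2, 1) = 0.8646647; the approximation agrees to about 6 decimals
```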

# Approximate the theoretical CDF for the Reject data
start_time <- Sys.time()
Reject.tCDF <- pCDDM(X.Reject, par$drift, par$theta, par$tzero, par$boundary)
end_time <- Sys.time()
MC_tCDF.time <- end_time-start_time
MC_tCDF.time <- round(MC_tCDF.time,4)

Obtaining the approximate CDF for the data generated with the Rejection sampling algorithm took 4.6844 seconds.

# Approximate the theoretical CDF for the invCDF data
start_time <- Sys.time()
invCDF.tCDF <- pCDDM(X.invCDF, par$drift, par$theta, par$tzero, par$boundary)
end_time <- Sys.time()
invCDF_tCDF.time <- end_time-start_time
invCDF_tCDF.time <- round(invCDF_tCDF.time,4)

Obtaining the approximate CDF for the data generated with the inverse-CDF algorithm took 4.259 seconds.

Step 3: Compare eCDFs vs approximate tCDF

# We build a simple function to compare these CDFs
getDifferences <- function(eCDF,tCDF){
  difference <- tCDF - eCDF
  difference.sum <- sum(difference)
  KS_statistic <- max(abs(difference)) # Kolmogorov–Smirnov statistic
  sq.difference <- sum((difference)^2)
  
  output <- round(cbind(difference.sum,KS_statistic,sq.difference),4)
  colnames(output) <- c("sumDiff","KS-statistic","SSDiff") 
  return(output)
}
getDifferences(RW.eCDF,RW.tCDF)
##      sumDiff KS-statistic SSDiff
## [1,]  8.3044       0.0215 0.1574
getDifferences(Reject.eCDF,Reject.tCDF)
##       sumDiff KS-statistic SSDiff
## [1,] -10.4874       0.0171  0.114

Comparing computation times

compareTime <- function(n.Samples, n.Datasets){
  times.RW     <- matrix(NA, nrow = n.Datasets, ncol = length(n.Samples))
  times.Reject <- matrix(NA, nrow = n.Datasets, ncol = length(n.Samples))
  for(m in 1:n.Datasets){
    i <- 1
    for(n in n.Samples){
      # Time the random walk emulator
      start_time <- Sys.time()
      sample.RW.cddm(n, par)
      end_time <- Sys.time()
      times.RW[m,i] <- round(end_time - start_time, 4)
      # Time the rejection sampler on the same sample size
      start_time <- Sys.time()
      sample.Reject.cddm(n, par, max.RT, plot = FALSE)
      end_time <- Sys.time()
      times.Reject[m,i] <- round(end_time - start_time, 4)
      i <- i + 1
    }
  }
  return(list("times.RW"     = times.RW,
              "times.Reject" = times.Reject))
}

Specific results

The Reject algorithm massively outperforms the Random Walk algorithm when the parameter values used for \(\delta\) and \(\eta\) imply a slower or longer walk.

# Arbitrary set of parameter values
par <- list("drift" = 1,   
            "theta" = pi,
            "tzero" = 0.1,
            "boundary" = 7)
n <- 5000 # No. samples

getDifferences(RW.eCDF,RW.tCDF)
##      sumDiff KS-statistic SSDiff
## [1,] 19.9724       0.0211 0.2308
getDifferences(Reject.eCDF,Reject.tCDF)
##      sumDiff KS-statistic SSDiff
## [1,]   8.175       0.0169 0.0711
getDifferences(invCDF.eCDF,invCDF.tCDF)
##       sumDiff KS-statistic   SSDiff
## [1,] 1454.061       0.9991 554.0573

References

  • Smith, P. L. (2016). Diffusion theory of decision making in continuous report. Psychological Review, 123(4), 425.
  • Smith, P. L., Garrett, P. M., & Zhou, J. (2023). Obtaining stable predicted distributions of response times and decision outcomes for the circular diffusion model. Computational Brain & Behavior, 6(4), 543-555.